Skip to content

Conversation

@njzjz
Copy link
Member

@njzjz njzjz commented Jan 7, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Enforced strict length consistency when pairing related sequences across model runtime, data utilities, and inference paths to fail fast on mismatches.
  • Tests

    • Hardened tests to require sequence-length parity and added dtype checks, improving reliability and clearer failures.
  • Chores

    • Updated lint configuration to include an additional rule identifier for consistent code quality.

✏️ Tip: You can customize this high-level summary in your review settings.

Copilot AI review requested due to automatic review settings January 7, 2026 17:49
@dosubot dosubot bot added the build label Jan 7, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 7, 2026

📝 Walkthrough

Walkthrough

This PR makes zip() usages explicit across the codebase by adding strict=True (with a few deliberate strict=False exceptions) and updates Ruff linting to include B905 with per-file ignores for some deepmd paths. It enforces iterable-length checks where zip() is used.

Changes

Cohort / File(s) Summary
Configuration & Linting
pyproject.toml
Added B905 to Ruff select and extended per-file-ignores to include deepmd/pt/** and deepmd/pd/** for B905.
Core inference & eval
deepmd/calculator.py, deepmd/dpmodel/infer/deep_eval.py, deepmd/jax/infer/deep_eval.py, deepmd/tf/infer/deep_eval.py
Replaced plain zip(...) with zip(..., strict=True) when pairing type maps, outputs and results to enforce exact length alignment.
Model internals & nlists
deepmd/dpmodel/atomic_model/linear_atomic_model.py, deepmd/dpmodel/descriptor/hybrid.py, deepmd/dpmodel/utils/nlist.py
Added strict=True to zips pairing rcuts/nsels, descriptor indices, and neighbor-list constructions; raises on mismatches.
TensorFlow descriptors
deepmd/tf/descriptor/...
loc_frame.py, se_a.py, se_a_ef.py, se_atten.py, se_r.py, se_t.py
Switched zip(...) calls to zip(..., strict=True) in compute_input_stats and related loops; one helper call also receives strict=True.
Training / entrypoints / utils
deepmd/tf/nvnmd/entrypoints/mapt.py, deepmd/tf/train/trainer.py, deepmd/utils/batch_size.py, deepmd/utils/data_system.py, deepmd/utils/model_branch_dict.py, deepmd/utils/update_sel.py
Use zip(..., strict=True) for dict construction, batching, table rendering, and system pairing; update_sel.py intentionally uses strict=False in one branch with a TODO.
Tests — general / consistent
source/tests/consistent/*, source/tests/consistent/io/test_io.py, source/tests/consistent/fitting/test_dipole.py, source/tests/common/dpmodel/test_pairtab_atomic_model.py
Tests updated to use zip(..., strict=True) for element-wise comparisons and result parity; test_io.py refactored to split atomic vs non-atomic results and compare four sequences with strict zips.
Tests — PaddlePaddle (pd)
source/tests/pd/...
Added strict=True to zips in pd tests enforcing length parity in comparisons.
Tests — PyTorch (pt)
source/tests/pt/...
Updated test zips to strict=True to enforce matching lengths in paired comparisons.
Tests — TensorFlow (tf)
source/tests/tf/...
Setup loops and mapping dict constructions switched to zip(..., strict=True) to enforce exact pairing of pbtxts, graph dirs, and test value lists.

Sequence Diagram(s)

(No sequence diagrams generated.)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • iProzd
  • wanghan-iapcm
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 65.79% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: enabling the B905 Ruff linting rule to enforce strict zip usage, which is the primary objective reflected across all modified files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

📜 Recent review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f82cffe and 94d2e05.

📒 Files selected for processing (1)
  • source/tests/pt/test_adamuon.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-10T04:28:11.377Z
Learnt from: OutisLi
Repo: deepmodeling/deepmd-kit PR: 5130
File: deepmd/pt/optimizer/adamuon.py:40-109
Timestamp: 2026-01-10T04:28:11.377Z
Learning: The coefficients (3.4445, -4.7750, 2.0315) in the `zeropower_via_newtonschulz5` function in `deepmd/pt/optimizer/adamuon.py` are standard across AdaMuon implementations and should not be modified for consistency with the reference implementations.

Applied to files:

  • source/tests/pt/test_adamuon.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (40)
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (c-cpp)
  • GitHub Check: Build wheels for cp310-manylinux_aarch64
  • GitHub Check: Build wheels for cp311-macosx_x86_64
  • GitHub Check: Build wheels for cp311-macosx_arm64
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build wheels for cp311-win_amd64
  • GitHub Check: Test Python (12, 3.13)
  • GitHub Check: Test Python (4, 3.10)
  • GitHub Check: Test Python (10, 3.10)
  • GitHub Check: Test Python (12, 3.10)
  • GitHub Check: Test Python (11, 3.10)
  • GitHub Check: Test Python (10, 3.13)
  • GitHub Check: Test Python (11, 3.13)
  • GitHub Check: Test Python (6, 3.13)
  • GitHub Check: Test Python (1, 3.10)
  • GitHub Check: Test Python (3, 3.13)
  • GitHub Check: Test Python (4, 3.13)
  • GitHub Check: Test Python (7, 3.13)
  • GitHub Check: Test Python (7, 3.10)
  • GitHub Check: Test Python (6, 3.10)
  • GitHub Check: Test Python (2, 3.13)
  • GitHub Check: Test Python (2, 3.10)
  • GitHub Check: Test Python (3, 3.10)
  • GitHub Check: Test Python (9, 3.10)
  • GitHub Check: Test Python (8, 3.10)
  • GitHub Check: Test Python (9, 3.13)
  • GitHub Check: Test Python (5, 3.10)
  • GitHub Check: Test Python (5, 3.13)
  • GitHub Check: Test Python (1, 3.13)
  • GitHub Check: Test Python (8, 3.13)
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
  • GitHub Check: Build C++ (cuda120, cuda)
  • GitHub Check: Build C++ (cpu, cpu)
  • GitHub Check: Build C++ (rocm, rocm)
  • GitHub Check: Test C++ (false, false, false, true)
  • GitHub Check: Build C++ (clang, clang)
  • GitHub Check: Test C++ (true, false, false, true)
  • GitHub Check: Test C++ (false, true, true, false)
  • GitHub Check: Test C++ (true, true, true, false)
🔇 Additional comments (2)
source/tests/pt/test_adamuon.py (2)

100-102: LGTM!

Adding strict=True provides a fail-fast safeguard ensuring model.parameters() and initial_params have matching lengths. Since initial_params is derived from the same parameter generator, this is a sensible defensive check.


374-374: LGTM!

Using strict=True here ensures the test will fail immediately if the loaded optimizer has a different parameter count than the original, which would indicate a bug in state dict serialization.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @pyproject.toml:
- Line 385: The linter rule B905 was added which requires all uses of zip() to
include an explicit strict= parameter; update every zip(...) occurrence (e.g.,
in functions like process_items, iterate_pairs, build_map, etc.) to use either
zip(..., strict=True) when you want mismatched lengths to raise an error, or
zip(..., strict=False) when mismatched lengths should be silently truncated; run
`ruff check --select B905 .` to list all 77 places and fix them accordingly, and
run the test suite/ruff again to verify no B905 violations remain.
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cc85a6d and 7690428.

📒 Files selected for processing (1)
  • pyproject.toml
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: deepmodeling/deepmd-kit PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-12T13:40:14.334Z
Learning: Install and run ruff linter with `uv pip install ruff`, then run `ruff check .` and `ruff format .` before committing changes
📚 Learning: 2025-12-12T13:40:14.334Z
Learnt from: CR
Repo: deepmodeling/deepmd-kit PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-12T13:40:14.334Z
Learning: Install and run ruff linter with `uv pip install ruff`, then run `ruff check .` and `ruff format .` before committing changes

Applied to files:

  • pyproject.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build wheels for cp310-manylinux_aarch64
  • GitHub Check: Test Python (5, 3.10)
  • GitHub Check: Test Python (12, 3.13)
  • GitHub Check: Test Python (10, 3.10)
  • GitHub Check: Test Python (7, 3.10)
  • GitHub Check: Test Python (6, 3.13)
  • GitHub Check: Build wheels for cp311-macosx_arm64
  • GitHub Check: Test Python (11, 3.13)
  • GitHub Check: Test Python (7, 3.13)
  • GitHub Check: Test Python (8, 3.13)
  • GitHub Check: Test Python (4, 3.13)
  • GitHub Check: Test Python (12, 3.10)
  • GitHub Check: Build wheels for cp311-win_amd64
  • GitHub Check: Test Python (5, 3.13)
  • GitHub Check: Test Python (4, 3.10)
  • GitHub Check: Build wheels for cp311-macosx_x86_64
  • GitHub Check: Test Python (1, 3.13)
  • GitHub Check: Test Python (8, 3.10)
  • GitHub Check: Test Python (1, 3.10)
  • GitHub Check: Test C++ (true, false, false, true)
  • GitHub Check: Test C++ (true, true, true, false)
  • GitHub Check: Test C++ (false, false, false, true)
  • GitHub Check: Analyze (c-cpp)
  • GitHub Check: Analyze (python)
  • GitHub Check: Build C++ (cpu, cpu)
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
  • GitHub Check: Build C++ (rocm, rocm)
  • GitHub Check: Build C++ (clang, clang)
  • GitHub Check: Build C++ (cuda120, cuda)

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enables the B905 linting rule (zip-without-explicit-strict) to enforce explicit use of the strict parameter in zip() calls, which helps prevent bugs caused by zipping iterables of different lengths.

  • Adds B905 rule to the Ruff linter configuration to enforce strict parameter in zip() calls

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
deepmd/pt/infer/deep_eval.py (1)

376-382: Use strict=True to catch mismatches between output names and results.

This zip pairs output definition names with computation results. If these don't match in length, it indicates a serious logic error where either:

  1. Not all requested outputs were computed, or
  2. Extra outputs were returned that don't correspond to definitions

Using strict=False would silently truncate and return an incomplete or incorrect result dictionary. Using strict=True would immediately surface such bugs.

✨ Suggested change
         return dict(
             zip(
                 [x.name for x in request_defs],
                 out,
-                strict=False,
+                strict=True,
             )
         )
🤖 Fix all issues with AI agents
In @deepmd/pd/utils/dataloader.py:
- Around line 169-170: The code asserts len(self.systems) ==
len(self.batch_sizes) but then iterates with for system, batch_size in
zip(self.systems, self.batch_sizes, strict=False), which is inconsistent; either
make the zip strict by changing strict=False to strict=True to enforce the
asserted invariant, or remove the assert if you intentionally want tolerant
iteration; update the zip call in the method where self.systems and
self.batch_sizes are used to reflect the chosen behavior (use strict=True to
align with the assertion or drop the assertion to keep strict=False).

In @deepmd/utils/model_branch_dict.py:
- Line 172: The loop using enumerate(zip(*wrapped_rows, strict=False)) in
model_branch_dict.py should use strict=True to enforce that all rows in
wrapped_rows match expected length; change the zip call to zip(..., strict=True)
(in the loop that iterates over cols for idx, col in enumerate(...)) so any
mismatch in len(self.headers) raises immediately and surfaces bugs; ensure tests
still pass and adjust any callers if they ever relied on padding behavior.
- Around line 190-191: The zip call using strict=False should be made strict to
surface mismatched-length bugs: find the expression "zip(cells_parts,
col_widths, strict=False)" (in model_branch_dict.py) and change it to use
strict=True so iteration fails fast if cells_parts and col_widths lengths
diverge; keep the surrounding logic that calls part.ljust(width) unchanged and
run tests to ensure no regressions.
🧹 Nitpick comments (37)
source/tests/pt/test_utils.py (1)

21-25: Consider using strict=True for better safety.

Since both lists are hardcoded with exactly 3 matching elements, using strict=True would be more appropriate. This would catch accidental modifications where one list is changed but not the other, aligning better with the PR's goal to "prevent issues with zip."

✨ Suggested refactor
         for ptp, npp in zip(
             [torch.float16, torch.float32, torch.float64],
             [np.float16, np.float32, np.float64],
-            strict=False,
+            strict=True,
         ):
deepmd/pt/train/training.py (1)

653-655: Consider using strict=True instead of strict=False.

By construction, self.model_keys and self.model_prob always have matching lengths. Line 626 initializes self.model_prob with exactly len(self.model_keys) elements, and subsequent operations (lines 629-637) only modify values, not the array length. Using strict=True would be more defensive and help catch potential bugs if the initialization logic changes in the future.

🔒 Proposed change to use strict=True
             model_key_prob_map=dict(
-                zip(self.model_keys, self.model_prob, strict=False)
+                zip(self.model_keys, self.model_prob, strict=True)
             ),
deepmd/tf/nvnmd/entrypoints/mapt.py (1)

437-446: Consider using strict=True for better error detection.

The pattern here extracts keys and values from dic_ph, processes the values through run_sess, then reconstructs a dictionary. Since TensorFlow's sess.run() returns results matching the input tensor list, len(keys) should always equal len(res_lst) and len(res_lst2).

Using strict=False silently ignores length mismatches, which could hide serious bugs. Consider using strict=True to fail fast if the invariant is violated:

res_dic = dict(zip(keys, res_lst, strict=True))

The same applies to line 446.

source/tests/consistent/io/test_io.py (2)

212-212: Consider verifying that result lengths match across backends.

Similar to line 205, using strict=False here could hide unexpected length mismatches between backends.


205-205: Add explicit documentation or assertion to clarify expected behavior for variable-length backend results.

The code uses strict=False with the NaN handling comment "expect all nan if not supported", suggesting intentional tolerance for backends returning different numbers of results. However, this design choice lacks explicit documentation or assertions.

Consider adding either:

  1. A comment explaining that backends may return different lengths when features are unsupported
  2. An assertion verifying both iterables match in length when neither is all-NaN

This clarifies whether the silent truncation is intentional protection against unsupported features or a gap in verification.

deepmd/calculator.py (1)

106-106: Consider using strict=True since lengths are guaranteed to match.

Based on the relevant code snippets, get_ntypes() returns len(self.type_map), which means range(self.dp.get_ntypes()) will always have exactly the same length as get_type_map(). Using strict=False is unnecessarily permissive and could mask bugs if these ever get out of sync.

📝 Suggested change
-            zip(self.dp.get_type_map(), range(self.dp.get_ntypes()), strict=False)
+            zip(self.dp.get_type_map(), range(self.dp.get_ntypes()), strict=True)
deepmd/pd/utils/nlist.py (1)

407-407: Consider using strict=True to match the length assertion.

Line 365 contains an assertion assert len(rcuts) == len(nsels) (active only in dynamic mode), indicating that these lists are expected to have matching lengths. Using strict=True would provide consistent length checking in all execution modes and help catch bugs earlier.

📝 Suggested change
-    for rc, ns in zip(rcuts[::-1], nsels[::-1], strict=False):
+    for rc, ns in zip(rcuts[::-1], nsels[::-1], strict=True):
deepmd/pt/model/descriptor/se_a.py (1)

788-793: Consider using strict=True since these lists are initialized with matching lengths.

At lines 550-561, compress_info and compress_data are both initialized as ParameterList with length len(self.filter_layers.networks). Since all three lists are guaranteed to have the same length, using strict=True would help catch any logic errors that might cause them to get out of sync.

📝 Suggested change
         for embedding_idx, (ll, compress_data_ii, compress_info_ii) in enumerate(
             zip(
                 self.filter_layers.networks,
                 self.compress_data,
                 self.compress_info,
-                strict=False,
+                strict=True,
             )
         ):
source/tests/consistent/fitting/test_dipole.py (1)

288-288: Consider verifying that result lengths match across backends.

Using strict=False allows the zip to silently truncate if ret1 and ret2 have different lengths. This could mask bugs where backends unexpectedly return different numbers of results. Consider adding a length assertion before the comparison loop to ensure backends are consistent.

📝 Suggested fix
+        assert len(ret1) == len(ret2), f"Result length mismatch: {len(ret1)} vs {len(ret2)}"
-        for rr1, rr2 in zip(ret1, ret2, strict=False):
+        for rr1, rr2 in zip(ret1, ret2, strict=True):
source/tests/common/dpmodel/test_pairtab_atomic_model.py (1)

128-162: Explicit strict=False correctly preserves existing behavior.

The addition of strict=False makes the zip behavior explicit, satisfying the B905 linting rule while preserving the current non-strict iteration behavior.

Since both input lists have exactly 14 elements and should always remain paired, consider using strict=True instead to catch potential errors if the lists diverge during future modifications:

♻️ Optional refinement
-        for dist, rcut in zip(
-            [
-                0.01,
-                0.015,
-                0.020,
-                0.015,
-                0.02,
-                0.021,
-                0.015,
-                0.02,
-                0.021,
-                0.025,
-                0.026,
-                0.025,
-                0.025,
-                0.0216161,
-            ],
-            [
-                0.015,
-                0.015,
-                0.015,
-                0.02,
-                0.02,
-                0.02,
-                0.022,
-                0.022,
-                0.022,
-                0.025,
-                0.025,
-                0.03,
-                0.035,
-                0.025,
-            ],
-            strict=False,
-        ):
+        for dist, rcut in zip(
+            [
+                0.01,
+                0.015,
+                0.020,
+                0.015,
+                0.02,
+                0.021,
+                0.015,
+                0.02,
+                0.021,
+                0.025,
+                0.026,
+                0.025,
+                0.025,
+                0.0216161,
+            ],
+            [
+                0.015,
+                0.015,
+                0.015,
+                0.02,
+                0.02,
+                0.02,
+                0.022,
+                0.022,
+                0.022,
+                0.025,
+                0.025,
+                0.03,
+                0.035,
+                0.025,
+            ],
+            strict=True,
+        ):
deepmd/pt/model/model/transform_output.py (1)

108-149: Explicit strict=False correctly preserves existing behavior.

The change makes the zip behavior explicit, satisfying the B905 linting rule. Both split_vv1 and split_svv1 are created by splitting with [1] * size (lines 122-123), so they have the same length.

Since both sequences are derived from the same size variable and should always have matching lengths, using strict=True would provide defensive error checking:

♻️ Optional refinement
-    for vvi, svvi in zip(split_vv1, split_svv1, strict=False):
+    for vvi, svvi in zip(split_vv1, split_svv1, strict=True):
source/tests/pd/test_utils.py (1)

21-25: Explicit strict=False correctly preserves existing behavior.

The addition of strict=False satisfies the B905 linting rule while maintaining the current non-strict iteration behavior.

Both lists contain exactly 3 elements and should remain paired. Consider strict=True for better error detection:

♻️ Optional refinement
-        for ptp, npp in zip(
-            [paddle.float16, paddle.float32, paddle.float64],
-            [np.float16, np.float32, np.float64],
-            strict=False,
-        ):
+        for ptp, npp in zip(
+            [paddle.float16, paddle.float32, paddle.float64],
+            [np.float16, np.float32, np.float64],
+            strict=True,
+        ):
deepmd/jax/infer/deep_eval.py (1)

248-254: Explicit strict=False correctly preserves existing behavior.

The change makes the zip behavior explicit, satisfying the B905 linting rule. The results tuple is constructed by iterating over request_defs (lines 381-397), so both sequences should have matching lengths.

Since the results list is built by iterating over the same request_defs, using strict=True would catch programming errors if the sequences inadvertently diverge:

♻️ Optional refinement
         return dict(
             zip(
                 [x.name for x in request_defs],
                 out,
-                strict=False,
+                strict=True,
             )
         )
source/tests/pt/model/test_nlist.py (1)

248-250: Explicit strict=False correctly preserves existing behavior.

The addition of strict=False satisfies the B905 linting rule while preserving the current non-strict iteration behavior.

Both lists contain exactly 2 elements and are meant to be paired for test parameterization. Consider strict=True for better error detection:

♻️ Optional refinement
-        for distinguish_types, mysel in zip(
-            [True, False], [sum(self.nsel), 300], strict=False
-        ):
+        for distinguish_types, mysel in zip(
+            [True, False], [sum(self.nsel), 300], strict=True
+        ):
deepmd/pt/model/descriptor/hybrid.py (1)

98-98: Consider using strict=True to catch potential length mismatches.

While strict=False preserves backward compatibility, start_idx and end_idx should always have the same length based on the construction logic. Using strict=True would catch any bugs where this assumption is violated, making the code more robust.

🔒 Suggested change for stricter validation
-                [range(ss, ee) for ss, ee in zip(start_idx, end_idx, strict=False)]
+                [range(ss, ee) for ss, ee in zip(start_idx, end_idx, strict=True)]
deepmd/pt/optimizer/LKF.py (1)

295-295: Consider using strict=True for safer parameter updates.

Both tmp_grad and tmp are produced by __split_weights() applied to the gradient and parameter of the same tensor, so they should always have matching lengths. Using strict=True would catch any logic errors in the splitting process, which is particularly important in optimizer code where silent bugs could corrupt training.

🔒 Suggested change for stricter validation
-            for split_grad, split_weight in zip(tmp_grad, tmp, strict=False):
+            for split_grad, split_weight in zip(tmp_grad, tmp, strict=True):
deepmd/pd/infer/deep_eval.py (1)

384-389: Consider using strict=True to ensure all outputs are returned.

The output dictionary is constructed by zipping output names from request_defs with results from _eval_model. These should always have matching lengths by design. Using strict=True would immediately catch any bugs where the number of outputs doesn't match expectations, rather than silently dropping outputs, which could lead to confusing missing keys in the returned dictionary.

🔒 Suggested change for stricter validation
         return dict(
             zip(
                 [x.name for x in request_defs],
                 out,
-                strict=False,
+                strict=True,
             )
         )
source/tests/pt/model/test_pairtab_atomic_model.py (1)

189-223: Consider using strict=True for safer zip behavior.

The two lists being zipped (lines 190-221) both contain exactly 14 elements and are designed to be paired together for testing various distance/rcut combinations. If these lists were to have mismatched lengths due to a coding error, using strict=True would immediately catch the bug rather than silently truncating to the shorter list.

✨ Suggested change
-        for dist, rcut in zip(
-            [
-                0.01,
-                0.015,
-                0.020,
-                0.015,
-                0.02,
-                0.021,
-                0.015,
-                0.02,
-                0.021,
-                0.025,
-                0.026,
-                0.025,
-                0.025,
-                0.0216161,
-            ],
-            [
-                0.015,
-                0.015,
-                0.015,
-                0.02,
-                0.02,
-                0.02,
-                0.022,
-                0.022,
-                0.022,
-                0.025,
-                0.025,
-                0.03,
-                0.035,
-                0.025,
-            ],
-            strict=False,
-        ):
+        for dist, rcut in zip(
+            [
+                0.01,
+                0.015,
+                0.020,
+                0.015,
+                0.02,
+                0.021,
+                0.015,
+                0.02,
+                0.021,
+                0.025,
+                0.026,
+                0.025,
+                0.025,
+                0.0216161,
+            ],
+            [
+                0.015,
+                0.015,
+                0.015,
+                0.02,
+                0.02,
+                0.02,
+                0.022,
+                0.022,
+                0.022,
+                0.025,
+                0.025,
+                0.03,
+                0.035,
+                0.025,
+            ],
+            strict=True,
+        ):
source/tests/tf/test_linear_model.py (1)

40-41: Consider using strict=True since the lists are guaranteed to have matching lengths.

Line 39 shows that self.graph_dirs is derived from self.pbtxts via list comprehension, ensuring they always have identical lengths. Using strict=True would provide better error detection if future refactoring accidentally breaks this relationship.

✨ Suggested change
-        for pbtxt, pb in zip(self.pbtxts, self.graph_dirs, strict=False):
+        for pbtxt, pb in zip(self.pbtxts, self.graph_dirs, strict=True):
             convert_pbtxt_to_pb(pbtxt, pb)
deepmd/pd/model/model/transform_output.py (1)

125-134: Consider using strict=True to ensure split operations produce matching chunks.

Lines 122-123 show both tensors are split using the same pattern [1] * size, which should produce equal-length lists. If the lengths differ, it indicates a serious issue with the tensor shapes or split operation. Using strict=True would catch such errors immediately rather than silently processing mismatched data.

✨ Suggested change
-    for vvi, svvi in zip(split_vv1, split_svv1, strict=False):
+    for vvi, svvi in zip(split_vv1, split_svv1, strict=True):
         # nf x nloc x 3, nf x nloc x 9
         ffi, aviri = task_deriv_one(
             vvi,
             svvi,
             coord_ext,
             do_virial=do_virial,
             do_atomic_virial=do_atomic_virial,
             create_graph=create_graph,
         )
deepmd/pt/model/descriptor/se_t.py (1)

847-854: Consider using strict=True to validate model structure consistency.

The three collections being zipped are initialized with matching lengths (lines 598-609): compress_info and compress_data are both created with range(len(self.filter_layers.networks)). If these lengths were to differ, it would indicate model corruption or an initialization bug. Using strict=True would catch such structural issues immediately during the forward pass.

✨ Suggested change
         for embedding_idx, (ll, compress_data_ii, compress_info_ii) in enumerate(
             zip(
                 self.filter_layers.networks,
                 self.compress_data,
                 self.compress_info,
-                strict=False,
+                strict=True,
             )
         ):
deepmd/dpmodel/infer/deep_eval.py (1)

228-234: Consider using strict=True to catch output mismatches.

The strict=False preserves backward compatibility, but using strict=True would catch logic errors where the number of output definitions doesn't match the number of results returned by the model.

🔧 Suggested improvement
         return dict(
             zip(
                 [x.name for x in request_defs],
                 out,
-                strict=False,
+                strict=True,
             )
         )
deepmd/dpmodel/utils/nlist.py (1)

237-237: Consider strict=True for consistency with the length assertion.

Since line 216 asserts len(rcuts) == len(nsels), using strict=True would provide redundant validation and a clearer error context if the assertion is somehow bypassed or if the assertion is removed in the future.

🔧 Suggested improvement
-    for rc, ns in zip(rcuts[::-1], nsels[::-1], strict=False):
+    for rc, ns in zip(rcuts[::-1], nsels[::-1], strict=True):
deepmd/pt/model/descriptor/se_r.py (1)

498-505: Consider strict=True to catch initialization issues.

The three iterables (filter_layers.networks, compress_data, compress_info) are initialized with the same length (lines 157-168). Using strict=True would detect any initialization bugs where the lengths diverge.

🔧 Suggested improvement
         for ii, (ll, compress_data_ii, compress_info_ii) in enumerate(
             zip(
                 self.filter_layers.networks,
                 self.compress_data,
                 self.compress_info,
-                strict=False,
+                strict=True,
             )
         ):
deepmd/pt/utils/nlist.py (1)

401-401: Consider strict=True for consistency with the length assertion.

Since line 366 asserts len(rcuts) == len(nsels), using strict=True would provide redundant validation and a clearer error context if the lengths somehow become mismatched.

🔧 Suggested improvement
-    for rc, ns in zip(rcuts[::-1], nsels[::-1], strict=False):
+    for rc, ns in zip(rcuts[::-1], nsels[::-1], strict=True):
deepmd/pd/model/descriptor/se_a.py (1)

765-771: Consider using strict=True instead of strict=False.

The three iterables (self.filter_layers.networks, self.compress_data, self.compress_info) are constructed together in __init__ (lines 525-536) with matching lengths. Using strict=False silently truncates to the shortest iterable if a programming error causes length mismatches, potentially masking bugs. Since equal lengths are expected, strict=True would catch such errors early.

🔍 Suggested change
-        for embedding_idx, (ll, compress_data_ii, compress_info_ii) in enumerate(
-            zip(
-                self.filter_layers.networks,
-                self.compress_data,
-                self.compress_info,
-                strict=False,
-            )
-        ):
+        for embedding_idx, (ll, compress_data_ii, compress_info_ii) in enumerate(
+            zip(
+                self.filter_layers.networks,
+                self.compress_data,
+                self.compress_info,
+                strict=True,
+            )
+        ):
deepmd/dpmodel/descriptor/hybrid.py (2)

104-104: Use strict=True for this zip call.

start_idx and end_idx are computed from the same hybrid_sel array (lines 101-102), so they are guaranteed to have the same length. Using strict=True would detect any future programming errors that violate this invariant.

🔍 Suggested change
             cut_idx = np.concatenate(
-                [range(ss, ee) for ss, ee in zip(start_idx, end_idx, strict=False)]
+                [range(ss, ee) for ss, ee in zip(start_idx, end_idx, strict=True)]
             )

313-313: Use strict=True for this zip call.

self.nlist_cut_idx is constructed with exactly one entry per descriptor in the loop at lines 94-106, so it should always have the same length as self.descrpt_list. Using strict=True ensures this invariant is maintained and catches bugs early.

🔍 Suggested change
-        for descrpt, nci in zip(self.descrpt_list, self.nlist_cut_idx, strict=False):
+        for descrpt, nci in zip(self.descrpt_list, self.nlist_cut_idx, strict=True):
source/tests/consistent/common.py (1)

358-358: Use strict=True in test comparison loops to catch backend inconsistencies.

These zip calls compare results from different backend implementations (TF, PT, DP, JAX, PD, array_api_strict). If backends return different numbers of results, that indicates an implementation bug. Using strict=False silently masks such inconsistencies by only comparing the overlapping elements. Use strict=True to ensure all backends produce the same number of outputs.

🔍 Example fix for line 358
-        for rr1, rr2 in zip(ret1, ret2, strict=False):
+        for rr1, rr2 in zip(ret1, ret2, strict=True):

Apply the same change to all similar zip calls in this file (lines 375, 394, 410, 439, 452, 476, 489, 518, 531, 553, 567).

Also applies to: 375-375, 394-394, 410-410, 439-439, 452-452, 476-476, 489-489, 518-518, 531-531, 553-553, 567-567

deepmd/tf/descriptor/se_atten.py (2)

381-389: Use strict=True for data integrity checking.

The parallel data arrays (data_coord, data_box, data_atype, natoms_vec, mesh, real_natoms_vec) represent the same batch of frames and should always have matching lengths. Using strict=False allows silent data corruption if these arrays somehow become misaligned. Use strict=True to validate data integrity.

🔍 Suggested change
                 for cc, bb, tt, nn, mm, r_n in zip(
                     data_coord,
                     data_box,
                     data_atype,
                     natoms_vec,
                     mesh,
                     real_natoms_vec,
-                    strict=False,
+                    strict=True,
                 ):

400-401: Use strict=True for data integrity checking.

The parallel data arrays (data_coord, data_box, data_atype, natoms_vec, mesh) represent the same batch of frames and should always have matching lengths. Using strict=False allows silent data corruption if these arrays become misaligned. Use strict=True to validate data integrity.

🔍 Suggested change
                 for cc, bb, tt, nn, mm in zip(
-                    data_coord, data_box, data_atype, natoms_vec, mesh, strict=False
+                    data_coord, data_box, data_atype, natoms_vec, mesh, strict=True
                 ):
deepmd/utils/data_system.py (1)

175-177: Use strict=True to maintain invariant between system_dirs and data_systems.

self.system_dirs and self.data_systems are constructed together in __init__ (lines 100-117), with one DeepmdData object created for each system directory. They should always have the same length. Using strict=False allows silent bugs if this invariant is violated. Use strict=True to catch such programming errors early.

🔍 Suggested change
                 for sys_dir, data_sys in zip(
-                    self.system_dirs, self.data_systems, strict=False
+                    self.system_dirs, self.data_systems, strict=True
                 ):
deepmd/dpmodel/atomic_model/linear_atomic_model.py (3)

157-161: Consider using strict=True for added safety.

The iterables self.get_model_rcuts() and self.get_model_nsels() both derive from self.models, so they should always have the same length. Using strict=True would help catch bugs if these lists ever become desynchronized.

Proposed change
         zipped = sorted(
-            zip(self.get_model_rcuts(), self.get_model_nsels(), strict=False),
+            zip(self.get_model_rcuts(), self.get_model_nsels(), strict=True),
             key=lambda x: (x[1], x[0]),
         )

236-241: Consider using strict=True for added safety.

The iterables self.get_model_rcuts() and self.get_model_nsels() both derive from self.models, so they should always have the same length. Using strict=True would help catch bugs if these lists ever become desynchronized.

Proposed change
         raw_nlists = [
             nlists[get_multiple_nlist_key(rcut, sel)]
-            for rcut, sel in zip(
-                self.get_model_rcuts(), self.get_model_nsels(), strict=False
-            )
+            for rcut, sel in zip(
+                self.get_model_rcuts(), self.get_model_nsels(), strict=True
+            )
         ]

242-247: Consider using strict=True for added safety.

The iterables self.mixed_types_list, raw_nlists, and self.get_model_sels() all derive from self.models and should have the same length. Using strict=True would help catch bugs if these lists ever become desynchronized.

Proposed change
         nlists_ = [
             nl if mt else nlist_distinguish_types(nl, extended_atype, sel)
             for mt, nl, sel in zip(
-                self.mixed_types_list, raw_nlists, self.get_model_sels(), strict=False
+                self.mixed_types_list, raw_nlists, self.get_model_sels(), strict=True
             )
         ]
source/tests/pt/model/test_descriptor_se_r.py (1)

103-110: Optional: Consider using strict=True for test robustness.

The zip iterates over two fixed-length lists [rd1, sw1] and [rd2, sw2], both containing exactly 2 elements. Using strict=True would be more explicit and protect against future modifications that might introduce length mismatches.

Proposed change
-            for aa, bb in zip([rd1, sw1], [rd2, sw2], strict=False):
+            for aa, bb in zip([rd1, sw1], [rd2, sw2], strict=True):
                 np.testing.assert_allclose(
                     aa.detach().cpu().numpy(),
                     bb,
deepmd/tf/descriptor/se_r.py (1)

273-275: Consider using strict=True to validate data alignment.

The zip iterates over training data arrays (data_coord, data_box, data_atype, natoms_vec, mesh) that should all have the same length (representing the same number of systems or frames). Using strict=False will silently process only up to the shortest array, potentially hiding data corruption or misalignment issues. Using strict=True would catch such problems early during statistics computation.

Proposed change
         for cc, bb, tt, nn, mm in zip(
-            data_coord, data_box, data_atype, natoms_vec, mesh, strict=False
+            data_coord, data_box, data_atype, natoms_vec, mesh, strict=True
         ):
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7690428 and f8aae86.

📒 Files selected for processing (52)
  • deepmd/calculator.py
  • deepmd/dpmodel/atomic_model/linear_atomic_model.py
  • deepmd/dpmodel/descriptor/hybrid.py
  • deepmd/dpmodel/infer/deep_eval.py
  • deepmd/dpmodel/utils/nlist.py
  • deepmd/jax/infer/deep_eval.py
  • deepmd/pd/infer/deep_eval.py
  • deepmd/pd/model/descriptor/se_a.py
  • deepmd/pd/model/model/transform_output.py
  • deepmd/pd/utils/dataloader.py
  • deepmd/pd/utils/nlist.py
  • deepmd/pt/infer/deep_eval.py
  • deepmd/pt/model/atomic_model/linear_atomic_model.py
  • deepmd/pt/model/descriptor/hybrid.py
  • deepmd/pt/model/descriptor/se_a.py
  • deepmd/pt/model/descriptor/se_r.py
  • deepmd/pt/model/descriptor/se_t.py
  • deepmd/pt/model/model/transform_output.py
  • deepmd/pt/optimizer/LKF.py
  • deepmd/pt/train/training.py
  • deepmd/pt/utils/dataloader.py
  • deepmd/pt/utils/nlist.py
  • deepmd/tf/descriptor/loc_frame.py
  • deepmd/tf/descriptor/se_a.py
  • deepmd/tf/descriptor/se_a_ef.py
  • deepmd/tf/descriptor/se_atten.py
  • deepmd/tf/descriptor/se_r.py
  • deepmd/tf/descriptor/se_t.py
  • deepmd/tf/infer/deep_eval.py
  • deepmd/tf/nvnmd/entrypoints/mapt.py
  • deepmd/tf/train/trainer.py
  • deepmd/utils/batch_size.py
  • deepmd/utils/data_system.py
  • deepmd/utils/model_branch_dict.py
  • deepmd/utils/update_sel.py
  • source/tests/common/dpmodel/test_pairtab_atomic_model.py
  • source/tests/consistent/common.py
  • source/tests/consistent/fitting/test_dipole.py
  • source/tests/consistent/io/test_io.py
  • source/tests/pd/model/test_nlist.py
  • source/tests/pd/model/test_se_e2_a.py
  • source/tests/pd/test_utils.py
  • source/tests/pt/model/test_descriptor_se_r.py
  • source/tests/pt/model/test_nlist.py
  • source/tests/pt/model/test_pairtab_atomic_model.py
  • source/tests/pt/model/test_se_e2_a.py
  • source/tests/pt/model/test_se_t.py
  • source/tests/pt/test_utils.py
  • source/tests/tf/test_linear_model.py
  • source/tests/tf/test_model_devi.py
  • source/tests/tf/test_model_devi_mix.py
  • source/tests/tf/test_nvnmd_entrypoints.py
✅ Files skipped from review due to trivial changes (7)
  • deepmd/tf/descriptor/loc_frame.py
  • deepmd/pt/model/atomic_model/linear_atomic_model.py
  • source/tests/pt/model/test_se_e2_a.py
  • deepmd/tf/descriptor/se_a.py
  • deepmd/tf/train/trainer.py
  • deepmd/tf/infer/deep_eval.py
  • source/tests/tf/test_model_devi_mix.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2024-10-08T15:32:11.479Z
Learnt from: njzjz
Repo: deepmodeling/deepmd-kit PR: 4160
File: deepmd/dpmodel/utils/env_mat.py:52-64
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Negative indices in `nlist` are properly handled by masking later in the computation, so they do not cause issues in indexing operations.

Applied to files:

  • deepmd/dpmodel/utils/nlist.py
  • deepmd/pt/utils/nlist.py
  • deepmd/pd/utils/nlist.py
📚 Learning: 2024-10-08T15:32:11.479Z
Learnt from: njzjz
Repo: deepmodeling/deepmd-kit PR: 4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.

Applied to files:

  • source/tests/tf/test_model_devi.py
🧬 Code graph analysis (4)
deepmd/dpmodel/atomic_model/linear_atomic_model.py (1)
deepmd/pt/model/atomic_model/linear_atomic_model.py (3)
  • get_model_rcuts (159-161)
  • get_model_nsels (174-176)
  • get_model_sels (178-180)
deepmd/tf/nvnmd/entrypoints/mapt.py (2)
deepmd/tf/utils/sess.py (1)
  • run_sess (12-50)
deepmd/tf/infer/deep_eval.py (2)
  • sess (311-314)
  • sess (1234-1237)
source/tests/tf/test_linear_model.py (1)
source/api_c/src/c_api.cc (2)
  • pbtxt (2381-2381)
  • pb (2382-2382)
deepmd/calculator.py (2)
deepmd/dpmodel/infer/deep_eval.py (2)
  • get_type_map (116-118)
  • get_ntypes (112-114)
deepmd/utils/data.py (2)
  • get_type_map (325-327)
  • get_ntypes (318-323)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (33)
  • GitHub Check: Test Python (12, 3.13)
  • GitHub Check: Test Python (1, 3.10)
  • GitHub Check: Test Python (3, 3.10)
  • GitHub Check: Test Python (9, 3.10)
  • GitHub Check: Test Python (2, 3.13)
  • GitHub Check: Test Python (11, 3.13)
  • GitHub Check: Test Python (12, 3.10)
  • GitHub Check: Test Python (10, 3.10)
  • GitHub Check: Test Python (5, 3.10)
  • GitHub Check: Test Python (8, 3.13)
  • GitHub Check: Test Python (7, 3.13)
  • GitHub Check: Test Python (3, 3.13)
  • GitHub Check: Test Python (4, 3.13)
  • GitHub Check: Test Python (9, 3.13)
  • GitHub Check: Test Python (8, 3.10)
  • GitHub Check: Test Python (11, 3.10)
  • GitHub Check: Test Python (2, 3.10)
  • GitHub Check: Build C++ (rocm, rocm)
  • GitHub Check: Build C++ (cuda120, cuda)
  • GitHub Check: Build C++ (clang, clang)
  • GitHub Check: Build C++ (cpu, cpu)
  • GitHub Check: Test C++ (false, true, true, false)
  • GitHub Check: Test C++ (false, false, false, true)
  • GitHub Check: Test C++ (true, false, false, true)
  • GitHub Check: Test C++ (true, true, true, false)
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
  • GitHub Check: Analyze (c-cpp)
  • GitHub Check: Analyze (python)
  • GitHub Check: Build wheels for cp311-macosx_arm64
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build wheels for cp311-win_amd64
  • GitHub Check: Build wheels for cp311-macosx_x86_64
  • GitHub Check: Build wheels for cp310-manylinux_aarch64
🔇 Additional comments (12)
source/tests/pd/model/test_nlist.py (1)

256-258: LGTM!

The explicit strict=False preserves the existing behavior. Since both lists have the same hardcoded length (2), this change is safe and aligns with the PR objective of enabling B905.

source/tests/pt/model/test_se_t.py (1)

103-103: LGTM!

The explicit strict=False maintains backward compatibility for this consistency test. Both lists are constructed with the same length (2 elements each), so the change is safe.

deepmd/utils/update_sel.py (1)

46-46: Appropriate use of strict=False.

The strict=False parameter is correctly used here since user-provided sel may have a different length than tmp_sel, and the code intentionally handles length mismatches by iterating over the shorter sequence and issuing warnings (lines 47-54).

deepmd/utils/batch_size.py (1)

251-251: LGTM!

Adding strict=False maintains the existing behavior where zip() stops at the shortest iterable. This is appropriate for this lint-compliance change.

deepmd/tf/descriptor/se_t.py (1)

260-262: LGTM!

Adding strict=False maintains backward compatibility for the zip() call over parallel training data arrays.

source/tests/tf/test_model_devi.py (2)

44-45: LGTM!

The strict=False maintains existing behavior. Since self.graph_dirs is derived from self.pbtxts on line 43, the lengths are guaranteed to match.


219-220: LGTM!

Same pattern as the other test class - cls.graph_dirs is derived from cls.pbtxts, guaranteeing equal lengths.

source/tests/pd/model/test_se_e2_a.py (1)

99-106: LGTM!

Both lists are explicitly constructed with 3 elements each, so strict=False maintains existing behavior without any risk of silent truncation.

source/tests/tf/test_nvnmd_entrypoints.py (2)

469-469: LGTM!

Since valuelist comes from sess.run(list(tensordic.values()), ...), its length matches tensordic.keys() by construction.


786-786: LGTM!

Same pattern as above - the valuelist and tensordic.keys() are guaranteed to have matching lengths.

deepmd/tf/descriptor/se_a_ef.py (1)

421-429: LGTM!

Adding strict=False maintains backward compatibility for the multi-iterable zip() call over parallel training data arrays. This is consistent with similar changes in other descriptor modules.

deepmd/pt/utils/dataloader.py (1)

159-160: LGTM!

The assertion on line 159 already guarantees that self.systems and self.batch_sizes have equal lengths, making strict=False safe here.

@codecov
Copy link

codecov bot commented Jan 7, 2026

Codecov Report

❌ Patch coverage is 91.66667% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 81.95%. Comparing base (0a807cf) to head (94d2e05).
⚠️ Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
deepmd/utils/data_system.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5136      +/-   ##
==========================================
- Coverage   81.95%   81.95%   -0.01%     
==========================================
  Files         713      713              
  Lines       73036    73036              
  Branches     3616     3617       +1     
==========================================
- Hits        59860    59858       -2     
  Misses      12014    12014              
- Partials     1162     1164       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @deepmd/pd/utils/dataset.py:
- Line 38: Call to get_item_paddle is passing an extra num_workers argument but
the function get_item_paddle only accepts index; remove the second argument from
the call (change b_data = self._data_system.get_item_paddle(index, max(1,
NUM_WORKERS)) to call with only index) or alternatively update the
get_item_paddle definition to accept a second parameter named num_workers and
propagate/use it accordingly; prefer removing the extra argument to match other
call sites unless you intentionally need num_workers support in the
get_item_paddle implementation.
🧹 Nitpick comments (1)
pyproject.toml (1)

433-435: Consider tracking the B905 exceptions as technical debt.

The exceptions for deepmd/pt/** and deepmd/pd/** are a pragmatic approach for migrating a large codebase. However, for consistency and clarity, it would be beneficial to eventually add explicit strict=False (or strict=True where appropriate) to zip() calls in these directories as well.

Would you like me to open an issue to track removing these exceptions in a future PR?

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 996a27c and 459b745.

📒 Files selected for processing (2)
  • deepmd/pd/utils/dataset.py
  • pyproject.toml
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-30T20:08:12.531Z
Learnt from: njzjz
Repo: deepmodeling/deepmd-kit PR: 4284
File: deepmd/jax/__init__.py:8-8
Timestamp: 2024-10-30T20:08:12.531Z
Learning: In the DeepMD project, entry points like `deepmd.jax` may be registered in external projects, so their absence in the local configuration files is acceptable.

Applied to files:

  • pyproject.toml
🧬 Code graph analysis (1)
deepmd/pd/utils/dataset.py (1)
deepmd/utils/data.py (1)
  • get_item_paddle (258-267)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (40)
  • GitHub Check: Test Python (10, 3.10)
  • GitHub Check: Test Python (5, 3.10)
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
  • GitHub Check: Test Python (7, 3.13)
  • GitHub Check: Test Python (9, 3.13)
  • GitHub Check: Test Python (5, 3.13)
  • GitHub Check: Test Python (12, 3.13)
  • GitHub Check: Test Python (2, 3.10)
  • GitHub Check: Test Python (10, 3.13)
  • GitHub Check: Test Python (1, 3.13)
  • GitHub Check: Test Python (4, 3.13)
  • GitHub Check: Test Python (8, 3.13)
  • GitHub Check: Test Python (12, 3.10)
  • GitHub Check: Test Python (6, 3.13)
  • GitHub Check: Test Python (3, 3.13)
  • GitHub Check: Test Python (6, 3.10)
  • GitHub Check: Test Python (7, 3.10)
  • GitHub Check: Test Python (9, 3.10)
  • GitHub Check: Test Python (11, 3.13)
  • GitHub Check: Test Python (3, 3.10)
  • GitHub Check: Test Python (11, 3.10)
  • GitHub Check: Test Python (4, 3.10)
  • GitHub Check: Test Python (8, 3.10)
  • GitHub Check: Test Python (2, 3.13)
  • GitHub Check: Test Python (1, 3.10)
  • GitHub Check: Test C++ (true, false, false, true)
  • GitHub Check: Analyze (c-cpp)
  • GitHub Check: Build wheels for cp311-macosx_x86_64
  • GitHub Check: Test C++ (true, true, true, false)
  • GitHub Check: Test C++ (false, true, true, false)
  • GitHub Check: Test C++ (false, false, false, true)
  • GitHub Check: Build wheels for cp311-win_amd64
  • GitHub Check: Build wheels for cp310-manylinux_aarch64
  • GitHub Check: Build wheels for cp311-macosx_arm64
  • GitHub Check: Build C++ (cpu, cpu)
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build C++ (rocm, rocm)
  • GitHub Check: Build C++ (clang, clang)
  • GitHub Check: Build C++ (cuda120, cuda)
  • GitHub Check: Analyze (python)
🔇 Additional comments (2)
pyproject.toml (1)

385-385: LGTM! Good addition of the B905 rule.

Enabling B905 is a solid improvement that will help catch potential bugs where zip() silently truncates to the shortest iterable. Since the project requires Python 3.10+ (line 58), the strict parameter is available in all supported versions.

deepmd/pd/utils/dataset.py (1)

8-10: LGTM: Import addition for worker-based parallelization.

The import is clean and necessary for the parallelization enhancement in line 38.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @deepmd/utils/update_sel.py:
- Around line 46-48: The loop using zip(tmp_sel, sel, strict=False) silently
skips excess items when len(tmp_sel) != len(sel), so add an explicit length
mismatch check before the loop: compare len(tmp_sel) and len(sel) and if they
differ, log a warning (or use the existing logger) indicating the lengths and
that validation will only run for the first min-length items; then keep the
current zip(...) loop unchanged to preserve TF spin model behavior. Reference
tmp_sel, sel, the for ii, (tt, dd) in enumerate(zip(tmp_sel, sel, strict=False))
loop and the len(tmp_sel) != len(sel) condition when implementing the check.
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de735a7 and 69189db.

📒 Files selected for processing (1)
  • deepmd/utils/update_sel.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (40)
  • GitHub Check: Test Python (10, 3.13)
  • GitHub Check: Test Python (9, 3.13)
  • GitHub Check: Analyze (c-cpp)
  • GitHub Check: Analyze (python)
  • GitHub Check: Test Python (8, 3.10)
  • GitHub Check: Test Python (7, 3.10)
  • GitHub Check: Test Python (4, 3.13)
  • GitHub Check: Test Python (11, 3.10)
  • GitHub Check: Test Python (8, 3.13)
  • GitHub Check: Test Python (7, 3.13)
  • GitHub Check: Test Python (11, 3.13)
  • GitHub Check: Test Python (9, 3.10)
  • GitHub Check: Test Python (12, 3.13)
  • GitHub Check: Test Python (6, 3.13)
  • GitHub Check: Test Python (12, 3.10)
  • GitHub Check: Test Python (10, 3.10)
  • GitHub Check: Test Python (5, 3.13)
  • GitHub Check: Test Python (6, 3.10)
  • GitHub Check: Test Python (3, 3.13)
  • GitHub Check: Test Python (3, 3.10)
  • GitHub Check: Test Python (1, 3.13)
  • GitHub Check: Test Python (2, 3.13)
  • GitHub Check: Test Python (5, 3.10)
  • GitHub Check: Test Python (1, 3.10)
  • GitHub Check: Test Python (4, 3.10)
  • GitHub Check: Test Python (2, 3.10)
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
  • GitHub Check: Test C++ (true, true, true, false)
  • GitHub Check: Test C++ (false, true, true, false)
  • GitHub Check: Test C++ (false, false, false, true)
  • GitHub Check: Test C++ (true, false, false, true)
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build wheels for cp311-macosx_arm64
  • GitHub Check: Build wheels for cp311-win_amd64
  • GitHub Check: Build wheels for cp310-manylinux_aarch64
  • GitHub Check: Build wheels for cp311-macosx_x86_64
  • GitHub Check: Build C++ (cpu, cpu)
  • GitHub Check: Build C++ (clang, clang)
  • GitHub Check: Build C++ (rocm, rocm)
  • GitHub Check: Build C++ (cuda120, cuda)

@njzjz njzjz requested review from iProzd and wanghan-iapcm January 10, 2026 18:21
@iProzd iProzd added this pull request to the merge queue Jan 12, 2026
github-merge-queue bot pushed a commit that referenced this pull request Jan 12, 2026
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

* **Bug Fixes**
* Enhanced validation of paired data sequences in atomic models,
descriptors, and inference modules to better detect inconsistencies and
mismatches.
* Improved error detection across data processing and training
pipelines.

* **Tests**
* Strengthened test validation with more rigorous consistency checks
across multiple test suites.

* **Chores**
  * Updated linting configuration for improved code quality standards.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Jinzhe Zeng <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 12, 2026
@iProzd iProzd added this pull request to the merge queue Jan 12, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 12, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
pyproject.toml (1)

434-436: Document the rationale for selectively ignoring B905 in pt and pd backends.

B905 is ignored for deepmd/pt/** and deepmd/pd/** but enforced for deepmd/tf/** and deepmd/jax/**. This asymmetry appears intentional (pt has 14 zip() calls, pd has 6, while tf enforces B905 despite having 15 zip() calls). Add an inline comment in pyproject.toml explaining why PyTorch and Paddle backends exclude B905 while TensorFlow and JAX enforce it—for example, whether these backends have intentional zip truncation behavior or refactoring constraints that justify the exclusion.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3c8adc3 and f82cffe.

📒 Files selected for processing (10)
  • pyproject.toml
  • source/tests/pd/model/test_nlist.py
  • source/tests/pd/model/test_se_e2_a.py
  • source/tests/pd/test_utils.py
  • source/tests/pt/model/test_descriptor_se_r.py
  • source/tests/pt/model/test_nlist.py
  • source/tests/pt/model/test_pairtab_atomic_model.py
  • source/tests/pt/model/test_se_e2_a.py
  • source/tests/pt/model/test_se_t.py
  • source/tests/pt/test_utils.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • source/tests/pt/model/test_pairtab_atomic_model.py
  • source/tests/pd/model/test_se_e2_a.py
  • source/tests/pt/model/test_nlist.py
  • source/tests/pt/model/test_se_t.py
  • source/tests/pd/model/test_nlist.py
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-12T13:40:14.334Z
Learnt from: CR
Repo: deepmodeling/deepmd-kit PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-12T13:40:14.334Z
Learning: Install and run ruff linter with `uv pip install ruff`, then run `ruff check .` and `ruff format .` before committing changes

Applied to files:

  • pyproject.toml
📚 Learning: 2025-12-12T13:40:14.334Z
Learnt from: CR
Repo: deepmodeling/deepmd-kit PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-12T13:40:14.334Z
Learning: Always run `ruff check . && ruff format .` before committing changes to ensure CI passes

Applied to files:

  • pyproject.toml
📚 Learning: 2024-10-30T20:08:12.531Z
Learnt from: njzjz
Repo: deepmodeling/deepmd-kit PR: 4284
File: deepmd/jax/__init__.py:8-8
Timestamp: 2024-10-30T20:08:12.531Z
Learning: In the DeepMD project, entry points like `deepmd.jax` may be registered in external projects, so their absence in the local configuration files is acceptable.

Applied to files:

  • pyproject.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (40)
  • GitHub Check: Test Python (12, 3.13)
  • GitHub Check: Test Python (8, 3.10)
  • GitHub Check: Test Python (10, 3.10)
  • GitHub Check: Test Python (9, 3.10)
  • GitHub Check: Test Python (8, 3.13)
  • GitHub Check: Test Python (7, 3.13)
  • GitHub Check: Test Python (7, 3.10)
  • GitHub Check: Test Python (11, 3.13)
  • GitHub Check: Test Python (12, 3.10)
  • GitHub Check: Test Python (3, 3.13)
  • GitHub Check: Test Python (11, 3.10)
  • GitHub Check: Test Python (5, 3.10)
  • GitHub Check: Test Python (9, 3.13)
  • GitHub Check: Test Python (6, 3.10)
  • GitHub Check: Test Python (10, 3.13)
  • GitHub Check: Test Python (5, 3.13)
  • GitHub Check: Test Python (6, 3.13)
  • GitHub Check: Test Python (1, 3.13)
  • GitHub Check: Test Python (3, 3.10)
  • GitHub Check: Test Python (4, 3.13)
  • GitHub Check: Test Python (4, 3.10)
  • GitHub Check: Test Python (2, 3.10)
  • GitHub Check: Test Python (2, 3.13)
  • GitHub Check: Test Python (1, 3.10)
  • GitHub Check: Build wheels for cp311-macosx_arm64
  • GitHub Check: Build wheels for cp310-manylinux_aarch64
  • GitHub Check: Test C++ (true, false, false, true)
  • GitHub Check: Analyze (c-cpp)
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
  • GitHub Check: Test C++ (false, false, false, true)
  • GitHub Check: Test C++ (false, true, true, false)
  • GitHub Check: Analyze (python)
  • GitHub Check: Test C++ (true, true, true, false)
  • GitHub Check: Build wheels for cp311-win_amd64
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build wheels for cp311-macosx_x86_64
  • GitHub Check: Build C++ (cuda120, cuda)
  • GitHub Check: Build C++ (clang, clang)
  • GitHub Check: Build C++ (rocm, rocm)
  • GitHub Check: Build C++ (cpu, cpu)
🔇 Additional comments (5)
source/tests/pd/test_utils.py (1)

21-25: LGTM!

Adding strict=True is appropriate here since the Paddle and NumPy dtype lists are intentionally paired 1:1. This ensures the test will fail fast if someone accidentally adds/removes a dtype from only one list.

source/tests/pt/model/test_se_e2_a.py (1)

99-106: LGTM!

Adding strict=True is appropriate here since both lists are explicitly defined with exactly 3 elements. This ensures the test will fail fast with a clear error if future modifications accidentally cause a length mismatch, rather than silently skipping comparisons.

source/tests/pt/model/test_descriptor_se_r.py (1)

103-110: LGTM!

The addition of strict=True is appropriate here. Both iterables are hardcoded 2-element lists, so this change is safe while also ensuring that any future refactoring that accidentally introduces a length mismatch will be caught immediately.

source/tests/pt/test_utils.py (1)

21-25: LGTM!

Adding strict=True is the correct choice here since both lists are hardcoded with the same length (3 elements each) and represent corresponding PyTorch/NumPy dtype pairs. This ensures any future accidental mismatch in list lengths will raise a ValueError rather than silently truncating.

pyproject.toml (1)

386-386: Good addition of B905 lint rule.

Enabling zip-without-explicit-strict helps catch potential bugs where iterables of different lengths are silently truncated. This is a valuable safety check that aligns with Python 3.10+ best practices.

* Initial plan

* fix(test): add strict=True parameter to zip() calls in test_adamuon.py

Co-authored-by: njzjz <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: njzjz <[email protected]>
@njzjz njzjz added this pull request to the merge queue Jan 12, 2026
Merged via the queue into deepmodeling:master with commit 7c043f0 Jan 12, 2026
70 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants